Temporary fix for DropTable/CreateTable issue around temporal tables #35225
Conversation
```diff
@@ -2606,9 +2606,8 @@ private IReadOnlyList<MigrationOperation> RewriteOperations(
                 operations.Add(operation);
             }

             // we removed the table, so we no longer need its temporal information
             // there will be no more operations involving this table
             temporalTableInformationMap.Remove((tableName, schema));
```
We made a change last year to reduce the size of temporal table annotations in the model. See #32239. Will this regress that at all?
No, annotations are added in a different place.
@roji the reason for the two-pass approach is that the model represents the state of the tables after all the migrations have been applied. There can be a situation where you have a column-manipulating operation, say DropColumn (which doesn't store temporal annotations), followed by an operation converting the table to temporal. With a one-pass approach, we would attempt to process the DropColumn operation without having seen any table-level operations yet, so we wouldn't know the temporal state of the table. If we consulted the model at that point, it would tell us that the table is temporal (because it will be converted to temporal later as part of this migration chain). We would then process the DropColumn operation the way we do for temporal tables: disable versioning, drop the column from the main table, drop the column from the history table, re-enable versioning. But in reality the table is still regular at that point, so we should just drop the column from it and be done.

The idea is to gather the initial state for each table. Dropping and creating the same table with different temporal info is fine - we only record the first one; once there is an entry in the dictionary we no longer update it. You shouldn't be seeing data from the last CreateTable when you process the first one.
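The "record only the first table-level state" idea above can be sketched as follows. This is a hypothetical, language-agnostic illustration (in Python for brevity); the operation shapes and the function name are made up and are not the actual EF Core implementation.

```python
def gather_initial_temporal_state(operations):
    """First pass: record the temporal state each table STARTS the migration in.

    Once a table has an entry, later operations never overwrite it, so a
    drop/create cycle inside the migration can't leak the final temporal
    state back into operations that run before the conversion.
    """
    state = {}
    for op in operations:
        key = (op["table"], op.get("schema"))
        # Only table-level operations carry temporal info; only the first
        # one seen for a given table defines its initial state.
        if op["kind"] in ("CreateTable", "DropTable") and key not in state:
            state[key] = op.get("temporal", {"is_temporal": False})
    return state
```

With this in place, the second pass can process a DropColumn that precedes a convert-to-temporal operation using the table's regular (non-temporal) initial state, instead of the future state the target model would report.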
OK - point taken about the model being the target model, and therefore representing a possible future state that isn't (yet) accurate. Though the current way of doing things still seems problematic in various ways - I'd have to think more about it. All this complexity and possible brittleness seems to exist simply because we're missing the source model when generating the migration, no? Wouldn't that be the right solution here, allowing us to know the exact state of any table at the start of the migration?

In any case, this 9.0 regression doesn't seem important enough to patch (no rush for 9.0.1 in any case), since it's a somewhat contrived case of DropTable/CreateTable in the same migration (which we think happens only if users manually edit their migrations).
If we had a "before" model, things would get easier, although the info-gathering step is not the hardest part IMO. The problem is with the processing - and doing it right. There are many moving parts and cases to consider: some operations require disabling the period, some require enabling it, some require mirroring to the history table, etc. We try to be smart about it and don't constantly disable/re-enable the period, but rather only do it when we have to.

I'm taking some time during the break to look into making this code more robust, e.g. validating what we think the state should be against what we actually see in the migration ops. I've also added better testing infra so that we can verify all the wonky scenarios. I do agree that the regression is not critical to patch for 9.0.1, but depending on how the work goes, we might want to patch it in later versions.
OK - I won't go deeper into this in any case. I do believe that if having the source model is helpful, we should add it - it's not unreasonable for it to be available during migration generation.
Closing this for now; we don't think it's worth patching at this moment. We'll do a more comprehensive fix/refactoring for this area.
@maumar this is a bug in #32239, which I think is the result of a wider design issue in that PR; the trigger is simply having a DropTable followed by a CreateTable in the same migration. Your second processing loop for temporal tables (code) removes the temporal data for the table when it sees the DropTable, and when we get to the CreateTable we get an exception, since there's no data at that point (code).

The hacky fix proposed here - which is probably OK for patching - is to simply not remove the data when we see DropTable, ensuring that it's still there when the CreateTable is processed. We've got two user reports on #35162; I'm waiting to hear how they ended up with DropTable/CreateTable in the same migration in order to better understand how critical this bug is.

However, I think the design here is fundamentally problematic: you can't really do a pass over all operations (and then over the model for missing ones), storing the temporal data in a single dictionary, because a single migration can create and drop the same table multiple times, each time with completely different temporal data. The current design flattens all that out, so that when you process e.g. the 1st CreateTable, you actually see the temporal data from the last CreateTable. IMHO the correct design would be a single pass: extract the temporal data when you see e.g. a CreateTable, and remember it in the dictionary. If you reach a table operation which doesn't contain temporal data and there's no entry in the dictionary (no CreateTable for that table appeared earlier in the migration), at that point you can consult the model, and again store the data in the dictionary. This way you always have the correct, up-to-date temporal data in the dictionary as you advance in a single pass through all the migration operations.
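The single-pass design sketched above could look roughly like this. Again a hypothetical illustration (in Python for brevity): the operation shapes, the `model` lookup, and the function name are invented for the example and are not EF Core APIs.

```python
def rewrite_operations_single_pass(operations, model):
    """Single pass: each operation is processed with the temporal state
    the table has AT THAT POINT in the migration, not its final state."""
    temporal = {}   # (table, schema) -> temporal info currently in effect
    processed = []
    for op in operations:
        key = (op["table"], op.get("schema"))
        if op["kind"] == "CreateTable":
            # CreateTable carries its own temporal data; remember it.
            temporal[key] = op.get("temporal", {"is_temporal": False})
        elif key not in temporal:
            # No earlier operation told us this table's state, so fall
            # back to the model, and cache the answer.
            temporal[key] = model.get(key, {"is_temporal": False})
        processed.append((op["kind"], temporal[key]))
        if op["kind"] == "DropTable":
            # The table is gone; a later CreateTable will redefine it
            # with its own (possibly different) temporal data.
            temporal.pop(key, None)
    return processed
```

Note how a drop/create cycle is handled naturally: the DropTable discards the entry, and the following CreateTable installs its own data, so each operation sees the data that is current for it rather than a flattened view.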
Fixes #35162